Goto

Collaborating Authors

 Vista



Enhancing Road Safety Through Multi-Camera Image Segmentation with Post-Encroachment Time Analysis

arXiv.org Artificial Intelligence

Abstract--Traffic safety analysis at signalized intersections is vital for reducing vehicle and pedestrian collisions, yet traditional crash-based studies are limited by data sparsity and latency. This paper presents a novel multi-camera computer vision framework for real-time safety assessment through Post-Encroachment Time (PET) computation, demonstrated at the intersection of H Street and Broadway in Chula Vista, California. Four synchronized cameras provide continuous visual coverage, with each frame processed on NVIDIA Jetson AGX Xavier devices using YOLOv11 segmentation for vehicle detection. Detected vehicle polygons are transformed into a unified bird's-eye map using homography matrices, enabling alignment across overlapping camera views. A novel pixel-level PET algorithm measures vehicle position without reliance on fixed cells, allowing fine-grained hazard visualization via dynamic heatmaps, accurate to 3.3 sq-cm. Timestamped vehicle and PET data is stored in an SQL database for long-term monitoring. Results over various time intervals demonstrate the framework's ability to identify high-risk regions with sub-second precision and real-time throughput on edge devices, producing data for an 800 800 pixel logarithmic heatmap at an average of 2.68 FPS. A. Context and Motivation Traffic safety at signalized intersections remains a critical concern in urban planning, as intersections present challenges of high vehicle conflict and elevated accident risk. Large and open intersections, in particular, present challenges due to increased vehicle maneuvering space, multiple conflict points, and reduced natural traffic control, which leads to higher speeds and greater uncertainty in driver behavior.



Police tech can sidestep facial recognition bans now

MIT Technology Review

Companies like Flock and Axon sell suites of sensors--cameras, license plate readers, gunshot detectors, drones--and then offer AI tools to make sense of that ocean of data (at last year's conference I saw schmoozing between countless AI-for-police startups and the chiefs they sell to on the expo floor). Departments say these technologies save time, ease officer shortages, and help cut down on response times. Those sound like fine goals, but this pace of adoption raises an obvious question: Who makes the rules here? When does the use of AI cross over from efficiency into surveillance, and what type of transparency is owed to the public? In some cases, AI-powered police tech is already driving a wedge between departments and the communities they serve.


A Sparse Bayesian Learning Algorithm for Estimation of Interaction Kernels in Motsch-Tadmor Model

arXiv.org Machine Learning

In this paper, we investigate the data-driven identification of asymmetric interaction kernels in the Motsch-Tadmor model based on observed trajectory data. The model under consideration is governed by a class of semilinear evolution equations, where the interaction kernel defines a normalized, state-dependent Laplacian operator that governs collective dynamics. To address the resulting nonlinear inverse problem, we propose a variational framework that reformulates kernel identification using the implicit form of the governing equations, reducing it to a subspace identification problem. We establish an iden-tifiability result that characterizes conditions under which the interaction kernel can be uniquely recovered up to scale. To solve the inverse problem robustly, we develop a sparse Bayesian learning algorithm that incorporates informative priors for regularization, quantifies uncertainty, and enables principled model selection. Extensive numerical experiments on representative interacting particle systems demonstrate the accuracy, robustness, and interpretability of the proposed framework across a range of noise levels and data regimes.


Abstracting Geo-specific Terrains to Scale Up Reinforcement Learning

arXiv.org Artificial Intelligence

ABSTRACT Multi - agent reinforcement learning (MARL) is increasingly ubiquitous in training dynamic and adaptive synthetic characters for interactive simulations on geo - specific terrains. Frameworks such as Unity's ML - Agents help to make such reinforcement learning e xperiments more accessible to the simulation community. Military training simulations also benefit from advances in MARL, but they have immense computational requirements due to their complex, continuous, stochastic, partially observable, non - stationary, a nd doctrine - based nature. Furthermore, these simulations require geo - specific terrains, further exacerbating the computational resources problem. In our research, we leverage Unity's waypoints to automatically generate multi - layered representation abstract ions of the geo - specific terrains to scale up reinforcement learning while still allowing the transfer of learned policies between different representations. Our early exploratory results on a novel MARL scenario, where each side has differing objectives, indicate that waypoint - based navigation enables faster and more efficient learning while producing trajectories similar to those taken by expert human players in CSGO gaming environments. This research points out the potential of waypoint - based navigation for reducing the computational costs of developing and training MARL models for military training simulations, where geo - specific terrains and differing objectives are crucial. ABOUT THE AUTHORS Volkan Ustun is the Associate Director of the Human - Inspired Adaptive Teaming Systems Group at the USC I nstitute for Creative Technologies .


Human-Robot Dialogue Annotation for Multi-Modal Common Ground

arXiv.org Artificial Intelligence

In this paper, we describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners. A particular challenge for establishing common ground arises in remote dialogue (occurring in disaster relief or search-and-rescue tasks), where a human and robot are engaged in a joint navigation and exploration task of an unfamiliar environment, but where the robot cannot immediately share high quality visual information due to limited communication constraints. Engaging in a dialogue provides an effective way to communicate, while on-demand or lower-quality visual information can be supplemented for establishing common ground. Within this paradigm, we capture propositional semantics and the illocutionary force of a single utterance within the dialogue through our Dialogue-AMR annotation, an augmentation of Abstract Meaning Representation. We then capture patterns in how different utterances within and across speaker floors relate to one another in our development of a multi-floor Dialogue Structure annotation schema. Finally, we begin to annotate and analyze the ways in which the visual modalities provide contextual information to the dialogue for overcoming disparities in the collaborators' understanding of the environment. We conclude by discussing the use-cases, architectures, and systems we have implemented from our annotations that enable physical robots to autonomously engage with humans in bi-directional dialogue and navigation.


A Survey of Conversational Search

arXiv.org Artificial Intelligence

As a cornerstone of modern information access, search engines have become indispensable in everyday life. With the rapid advancements in AI and natural language processing (NLP) technologies, particularly large language models (LLMs), search engines have evolved to support more intuitive and intelligent interactions between users and systems. Conversational search, an emerging paradigm for next-generation search engines, leverages natural language dialogue to facilitate complex and precise information retrieval, thus attracting significant attention. Unlike traditional keyword-based search engines, conversational search systems enhance user experience by supporting intricate queries, maintaining context over multi-turn interactions, and providing robust information integration and processing capabilities. Key components such as query reformulation, search clarification, conversational retrieval, and response generation work in unison to enable these sophisticated interactions. In this survey, we explore the recent advancements and potential future directions in conversational search, examining the critical modules that constitute a conversational search system. We highlight the integration of LLMs in enhancing these systems and discuss the challenges and opportunities that lie ahead in this dynamic field. Additionally, we provide insights into real-world applications and robust evaluations of current conversational search systems, aiming to guide future research and development in conversational search.


Artificial Intelligence-Based Triaging of Cutaneous Melanocytic Lesions

arXiv.org Artificial Intelligence

Pathologists are facing an increasing workload due to a growing volume of cases and the need for more comprehensive diagnoses. Aiming to facilitate workload reduction and faster turnaround times, we developed an artificial intelligence (AI) model for triaging cutaneous melanocytic lesions based on whole slide images. The AI model was developed and validated using a retrospective cohort from the UMC Utrecht. The dataset consisted of 52,202 whole slide images from 27,167 unique specimens, acquired from 20,707 patients. Specimens with only common nevi were assigned to the low complexity category (86.6%). In contrast, specimens with any other melanocytic lesion subtype, including non-common nevi, melanocytomas, and melanomas, were assigned to the high complexity category (13.4%). The dataset was split on patient level into a development set (80%) and test sets (20%) for independent evaluation. Predictive performance was primarily measured using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). A simulation experiment was performed to study the effect of implementing AI-based triaging in the clinic. The AI model reached an AUROC of 0.966 (95% CI, 0.960-0.972) and an AUPRC of 0.857 (95% CI, 0.836-0.877) on the in-distribution test set, and an AUROC of 0.899 (95% CI, 0.860-0.934) and an AUPRC of 0.498 (95% CI, 0.360-0.639) on the out-of-distribution test set. In the simulation experiment, using random case assignment as baseline, AI-based triaging prevented an average of 43.9 (95% CI, 36-55) initial examinations of high complexity cases by general pathologists for every 500 cases. In conclusion, the AI model achieved a strong predictive performance in differentiating between cutaneous melanocytic lesions of high and low complexity. The improvement in workflow efficiency due to AI-based triaging could be substantial.


What's next for drones

MIT Technology Review

These developments raise a number of questions: Are drones safe enough to be flown in dense neighborhoods and cities? Is it a violation of people's privacy for police to fly drones overhead at an event or protest? Who decides what level of drone autonomy is acceptable in a war zone? Those questions are no longer hypothetical. Advancements in drone technology and sensors, falling prices, and easing regulations are making drones cheaper, faster, and more capable than ever.